13 research outputs found

    A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi

    Effective research in parasite biology requires analyzing experimental lab data in the context of constantly expanding public data resources. Integrating lab data with public resources is particularly difficult for biologists who may not have the computational skills needed to acquire and process heterogeneous data stored at different locations. We therefore developed a semantic problem solving environment (SPSE) that allows parasitologists to query their lab data, integrated with public resources, using ontologies. An ontology specifies a common vocabulary and formal relationships among the terms that describe, in this case, an organism and its experimental data and processes. SPSE supports capturing and querying provenance information, which is metadata on the experimental processes and data recorded for reproducibility, and includes a visual query-processing tool for formulating complex queries without learning the query language syntax. We demonstrate the significance of SPSE in identifying gene knockout targets for T. cruzi. The overall goal of SPSE is to help researchers discover new or existing knowledge that is implicitly present in the data but not always easily detected. Results demonstrate improved usefulness of SPSE over existing lab systems and approaches, and support for complex query design that is otherwise difficult to achieve without knowledge of query language syntax.

    The steady-state transcriptome of the four major life-cycle stages of Trypanosoma cruzi

    Background: Chronic chagasic cardiomyopathy is a debilitating and frequently fatal outcome of human infection with the protozoan parasite, Trypanosoma cruzi. Microarray analysis of gene expression during the T. cruzi life-cycle could be a valuable means of identifying drug and vaccine targets based on their appropriate expression patterns, but results from previous microarray studies in T. cruzi and related kinetoplastid parasites have suggested that the transcript abundances of most genes in these organisms do not vary significantly between life-cycle stages. Results: In this study, we used whole genome, oligonucleotide microarrays to globally determine the extent to which T. cruzi regulates mRNA relative abundances over the course of its complete life-cycle. In contrast to previous microarray studies in kinetoplastids, we observed that relative transcript abundances for over 50% of the genes detected on the T. cruzi microarrays were significantly regulated during the T. cruzi life-cycle. The significant regulation of 25 of these genes was confirmed by quantitative reverse-transcriptase PCR (qRT-PCR). The T. cruzi transcriptome also mirrored published protein expression data for several functional groups. Among the differentially regulated genes were members of paralog clusters, nearly 10% of which showed divergent expression patterns between cluster members. Conclusion: Taken together, these data support the conclusion that transcript abundance is an important level of gene expression regulation in T. cruzi. Thus, microarray analysis is a valuable screening tool for identifying stage-regulated T. cruzi genes and metabolic pathways.

    A unified framework for managing provenance information in translational research

    Background: A critical aspect of the NIH Translational Research roadmap, which seeks to accelerate the delivery of "bench-side" discoveries to patient's "bedside," is the management of the provenance metadata that keeps track of the origin and history of data resources as they traverse the path from the bench to the bedside and back. A comprehensive provenance framework is essential for researchers to verify the quality of data, reproduce scientific results published in peer-reviewed literature, validate scientific process, and associate trust value with data and results. Traditional approaches to provenance management have focused on only partial sections of the translational research life cycle and they do not incorporate "domain semantics", which is essential to support domain-specific querying and analysis by scientists. Results: We identify a common set of challenges in managing provenance information across the pre-publication and post-publication phases of data in the translational research lifecycle. We define the semantic provenance framework (SPF), underpinned by the Provenir upper-level provenance ontology, to address these challenges in the four stages of provenance metadata: (a) Provenance collection - during data generation; (b) Provenance representation - to support interoperability, reasoning, and incorporate domain semantics; (c) Provenance storage and propagation - to allow efficient storage and seamless propagation of provenance as the data is transferred across applications; (d) Provenance query - to support queries with increasing complexity over large data size and also support knowledge discovery applications. We apply the SPF to two exemplar translational research projects, namely the Semantic Problem Solving Environment for Trypanosoma cruzi (T. cruzi SPSE) and the Biomedical Knowledge Repository (BKR) project, to demonstrate its effectiveness. Conclusions: The SPF provides a unified framework to effectively manage provenance of translational research data during pre and post-publication phases. This framework is underpinned by an upper-level provenance ontology called Provenir that is extended to create domain-specific provenance ontologies to facilitate provenance interoperability, seamless propagation of provenance, automated querying, and analysis.

    High Throughput Selection of Effective Serodiagnostics for Trypanosoma cruzi infection

    The diagnosis of Trypanosoma cruzi infection (the cause of human Chagas disease) is difficult because the symptoms of the infection are often absent or non-specific, and because the parasites themselves are usually below the level of detection in infected subjects. Therefore, diagnosis generally depends on the measurement of T. cruzi–specific antibodies produced in response to the infection. However, current methods to detect anti–T. cruzi antibodies are relatively poor. In this study, we conducted a broad screen of >400 T. cruzi proteins to identify those best able to detect anti–T. cruzi antibodies. Using a set of proteins selected by this screen, we were able to detect 100% of >100 confirmed positive human cases of T. cruzi infection, as well as suspect cases that were negative using existing tests. This protein panel was also able to detect apparent changes in infection status following drug treatment of individuals with chronic T. cruzi infection. The results of this study should allow for significant improvements in the detection of T. cruzi infection and better screening methods to avoid blood transfusion–related transmission of the infection, and offer a crucial tool for determining the success or failure of drug treatment and other intervention strategies to limit the impact of Chagas disease.

    Widespread, focal copy number variations (CNV) and whole chromosome aneuploidies in Trypanosoma cruzi strains revealed by array comparative genomic hybridization

    Background: Trypanosoma cruzi is a protozoan parasite and the etiologic agent of Chagas disease, an important public health problem in Latin America. T. cruzi is diploid, almost exclusively asexual, and displays an extraordinarily diverse population structure both genetically and phenotypically. Yet, to date the genotypic diversity of T. cruzi and its relationship, if any, to biological diversity have not been studied at the whole genome level. Results: In this study, we used whole genome oligonucleotide tiling arrays to compare gene content in biologically disparate T. cruzi strains by comparative genomic hybridization (CGH). We observed that T. cruzi strains display widespread and focal copy number variations (CNV) and a substantially greater level of diversity than can be adequately defined by the current genetic typing methods. As expected, CNV were particularly frequent in gene family-rich regions containing mucins and trans-sialidases but were also evident in core genes. Gene groups that showed little variation in copy numbers among the strains tested included those encoding protein kinases and ribosomal proteins, suggesting these loci were less permissive to CNV. Moreover, frequent variation in chromosome copy numbers was observed, and chromosome-specific CNV signatures were shared by genetically divergent T. cruzi strains. Conclusions: The large number of CNV, over 4,000, reported here upholds at a whole genome level the long held paradigm of extraordinary genome plasticity among T. cruzi strains. Moreover, the fact that these heritable markers do not parse T. cruzi strains along the same lines as traditional typing methods is strongly suggestive of genetic exchange playing a major role in T. cruzi population structure and biology.

    Screenshot of the Cuebee interface after formulation of query 1, “List the genes that are downregulated in the epimastigote stage and exist in a single metabolic pathway.”

    Each row contains triples that are required to formulate the query. Query formulation begins with selecting the server (PE All Datasets). After selecting the dataset, users begin to type in the search field, and Cuebee suggests terms matching the first letters typed in a drop-down list. In this case “Microarray Analysis” is selected, and the query is limited to microarray analysis data for the “epimastigote” lifecycle stage of the parasite using Cuebee's filtering function. The triples are then extended as shown to achieve the desired query. The query uses Cuebee's “Group by” function to group all the epimastigote genes associated with a single metabolic pathway and the “Refine” function to keep only those genes from the group that are downregulated, i.e., with a log2 ratio less than −1. The Specific results section shows part of the results, which include gene information from the microarray lab data and pathway information from KEGG, where each pathway ID represents a specific pathway in KEGG.
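    The filter, group, and refine steps described in this caption can be sketched in plain Python. This is only an illustration under stated assumptions, not Cuebee's implementation: the record layout, the gene and pathway identifiers, and the function name are hypothetical, and "exist in a single metabolic pathway" is read here as "linked to exactly one pathway". Only the stage filter, the grouping by pathway, and the log2 ratio cutoff of −1 come from the caption.

```python
from collections import defaultdict

# Hypothetical microarray records: (gene_id, lifecycle_stage, log2_ratio).
EXPRESSION = [
    ("gene_A", "epimastigote", -1.8),
    ("gene_B", "epimastigote", -2.3),
    ("gene_C", "epimastigote", 0.4),
    ("gene_A", "trypomastigote", 1.1),
    ("gene_D", "epimastigote", -1.0),
]

# Hypothetical gene-to-pathway links: (gene_id, pathway_id).
PATHWAYS = [
    ("gene_A", "pathway_1"),
    ("gene_B", "pathway_1"),
    ("gene_B", "pathway_2"),
    ("gene_C", "pathway_3"),
    ("gene_D", "pathway_2"),
]


def downregulated_single_pathway(expression, pathways, stage, cutoff=-1.0):
    """Return {gene_id: pathway_id} for genes that are downregulated in
    `stage` (log2 ratio strictly below `cutoff`) and linked to exactly
    one pathway."""
    # "Group by": collect the set of pathways linked to each gene.
    pathways_by_gene = defaultdict(set)
    for gene, pathway in pathways:
        pathways_by_gene[gene].add(pathway)

    selected = {}
    for gene, record_stage, log2_ratio in expression:
        # Filter: keep only records from the requested lifecycle stage.
        if record_stage != stage:
            continue
        # Keep only genes linked to exactly one pathway (assumed reading).
        linked = pathways_by_gene.get(gene, set())
        if len(linked) != 1:
            continue
        # "Refine": keep only downregulated genes.
        if log2_ratio < cutoff:
            selected[gene] = next(iter(linked))
    return selected


result = downregulated_single_pathway(EXPRESSION, PATHWAYS, "epimastigote")
print(result)
```

    In this toy data, gene_B is dropped because it is linked to two pathways, and gene_D is dropped because a log2 ratio of exactly −1 does not satisfy the strict "less than −1" condition.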

    Screenshot of the Cuebee interface for query formulation.

    The row contains a triple (subject-predicate-object) that is required to formulate the query. The expressions over the arrows represent relationships (predicates) that link the subject and object. Query formulation begins with selecting the server (PE All Datasets). Users who know which datasets the query will use can select them here, such as the microarray dataset or the gene knockout dataset. Users who are unsure can select PE All Datasets, and Cuebee will try to find answers using all the datasets in the Parasite Knowledge Base (PKB). Users then begin to type in the search field, and Cuebee suggests terms matching the first letters typed in a drop-down list. In this case “Microarray Analysis” is selected. Users can select a specific instance of Microarray Analysis if it is known; otherwise, they can select “any_Microarray_analysis”, which lets Cuebee find answers using all the microarray data. Cuebee provides a definition of each concept (under Class Description) and more information about relationships (under Relations), as shown for the concept “gene” in this figure. An asterisk in front of a relationship means that it is directly associated with the concept “gene”, where “gene” acts as the subject of the triple. This information comes from the ontology, PEO in this case. Once the desired query is formulated, users can click Search, and Cuebee will list results under the Specific results or General results section. Users can also query the results of their first query using the Refine button. A video demo of querying with Cuebee is available at: http://wiki.knoesis.org/index.php/Manuscript_Details.

    New Web forms that store the data in an RDF subject-predicate-object (i.e., triple) format, providing an opportunity to relate the data to ontology concepts.

    Storing the data using these Web forms has no impact on the front-end user experience, but it offers extended querying functionality through the use of ontology concepts. Provenance information added through these Web forms is instantly available for querying.
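    As a rough illustration of what storing form data as subject-predicate-object triples makes possible, the sketch below turns one submitted form into triples and then queries them by pattern. It is a minimal sketch in plain Python and does not reflect how the Web forms described here are implemented: the subject identifier, the predicate names, and both helper functions are hypothetical.

```python
def form_to_triples(subject, form_fields):
    """Turn one submitted form into (subject, predicate, object) triples.
    Each field name is treated as a predicate (hypothetical mapping)."""
    return [(subject, predicate, value) for predicate, value in form_fields.items()]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical form submission describing one lab sample.
store = form_to_triples(
    "sample_42",
    {"has_lifecycle_stage": "epimastigote", "prepared_by": "researcher_1"},
)

# The new triples can be queried as soon as they are in the store.
hits = match(store, p="has_lifecycle_stage", o="epimastigote")
print(hits)
```

    Because every field becomes a triple whose predicate could be mapped to an ontology concept, the same pattern-matching query works across records entered through different forms.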